Recently, Deep Neural Networks (DNNs) have been widely introduced into Collaborative Filtering (CF) to produce more accurate recommendation results, owing to their capability of capturing the complex nonlinear relationships between items and users. However, DNN-based models usually suffer from high computational complexity, i.e., they consume very long training time and store a large number of trainable parameters. To address these problems, we propose a new broad recommender system called Broad Collaborative Filtering (BroadCF), which is an efficient nonlinear collaborative filtering approach. Instead of DNNs, a Broad Learning System (BLS) is used as the mapping function to learn the complex nonlinear relationships between users and items, which avoids the above issues while achieving very satisfactory recommendation performance. However, directly feeding the original rating data into BLS is not feasible. To this end, we propose a user-item rating collaborative vector preprocessing procedure to generate low-dimensional input data, which is able to harness the quality judgments of the most similar users/items. Extensive experiments conducted on seven benchmark datasets have confirmed the effectiveness of the proposed BroadCF algorithm.
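As a rough illustration of the rating preprocessing step, the following is a minimal sketch that builds a low-dimensional input for a (user, item) pair from the ratings of the k most similar users; the cosine-similarity choice, the function name, and k are assumptions for illustration, not the paper's actual procedure.

```python
import numpy as np

def collaborative_vector(ratings, user_idx, item_idx, k=5):
    """Build a low-dimensional input vector for (user, item) from the
    ratings of the k users most similar to user_idx (illustrative sketch).

    ratings: (n_users, n_items) matrix, 0 = unrated.
    Returns a k-dimensional vector of the neighbors' ratings on item_idx.
    """
    # Cosine similarity between the target user and all users.
    target = ratings[user_idx]
    norms = np.linalg.norm(ratings, axis=1) * np.linalg.norm(target) + 1e-8
    sims = ratings @ target / norms
    sims[user_idx] = -np.inf  # exclude the user itself

    # Ratings of the k nearest neighbors on the target item form the
    # low-dimensional input that would then be fed to the broad learning system.
    neighbors = np.argsort(sims)[-k:]
    return ratings[neighbors, item_idx]

# Toy usage
R = np.random.randint(0, 6, size=(20, 15)).astype(float)
x = collaborative_vector(R, user_idx=3, item_idx=7, k=5)
print(x.shape)  # (5,)
```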
In recent years, large amounts of effort have been put into pushing forward the real-world application of dynamic digital human (DDH). However, most current quality assessment research focuses on evaluating static 3D models and usually ignores motion distortions. Therefore, in this paper, we construct a large-scale dynamic digital human quality assessment (DDH-QA) database with diverse motion content as well as multiple distortions to comprehensively study the perceptual quality of DDHs. Both model-based distortion (noise, compression) and motion-based distortion (binding error, motion unnaturalness) are taken into consideration. Ten types of common motion are employed to drive the DDHs and a total of 800 DDHs are generated in the end. Afterward, we render the video sequences of the distorted DDHs as the evaluation media and carry out a well-controlled subjective experiment. Then a benchmark experiment is conducted with the state-of-the-art video quality assessment (VQA) methods and the experimental results show that existing VQA methods are limited in assessing the perceptual loss of DDHs. The database will be made publicly available to facilitate future research.
Inspired by the recent success of Transformers for Natural Language Processing and vision Transformers for Computer Vision, many researchers in the medical imaging community have flocked to Transformer-based networks for various mainstream medical tasks such as classification, segmentation, and estimation. In this study, we analyze two recently published Transformer-based network architectures for the task of multimodal head-and-neck tumor segmentation and compare their performance to the de facto standard 3D segmentation network - the nnU-Net. Our results showed that modeling long-range dependencies may be helpful in cases where large structures are present and/or a large field of view is needed. However, for small structures such as head-and-neck tumors, the convolution-based U-Net architecture seemed to perform well, especially when the training dataset is small and computational resources are limited.
Recent advances in neural radiance fields have enabled the high-fidelity 3D reconstruction of complex scenes for novel view synthesis. However, it remains underexplored how the appearance of such representations can be efficiently edited while maintaining photorealism. In this work, we present PaletteNeRF, a novel method for photorealistic appearance editing of neural radiance fields (NeRF) based on 3D color decomposition. Our method decomposes the appearance of each 3D point into a linear combination of palette-based bases (i.e., 3D segmentations defined by a group of NeRF-type functions) that are shared across the scene. While our palette-based bases are view-independent, we also predict a view-dependent function to capture the color residual (e.g., specular shading). During training, we jointly optimize the basis functions and the color palettes, and we also introduce novel regularizers to encourage the spatial coherence of the decomposition. Our method allows users to efficiently edit the appearance of the 3D scene by modifying the color palettes. We also extend our framework with compressed semantic features for semantic-aware appearance editing. We demonstrate that our technique is superior to baseline methods both quantitatively and qualitatively for appearance editing of complex real-world scenes.
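To make the decomposition concrete, here is a minimal sketch of composing a point's color as a weighted sum of shared palette entries plus a view-dependent residual; the array shapes, Dirichlet-sampled weights, and function name are illustrative assumptions, not the PaletteNeRF implementation.

```python
import numpy as np

def composed_color(weights, palette, residual):
    """Per-point radiance as a palette mixture plus a view-dependent residual.

    weights:  (N, P) per-point blending weights over P palette entries
              (predicted by view-independent basis functions).
    palette:  (P, 3) global RGB palette shared across the scene.
    residual: (N, 3) view-dependent color residual (e.g., specular shading).
    """
    return weights @ palette + residual

# Toy usage: editing the scene appearance amounts to replacing palette rows.
N, P = 1024, 4
weights = np.random.dirichlet(np.ones(P), size=N)   # convex blending weights
palette = np.array([[0.8, 0.1, 0.1],
                    [0.1, 0.7, 0.2],
                    [0.2, 0.2, 0.9],
                    [0.9, 0.9, 0.9]])
residual = 0.05 * np.random.randn(N, 3)
rgb = composed_color(weights, palette, residual)
print(rgb.shape)  # (1024, 3)
```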
Time series anomaly detection strives to uncover potential abnormal behaviors and patterns from temporal data, and has fundamental significance in diverse application scenarios. Constructing an effective detection model usually requires adequate training data stored in a centralized manner; however, this requirement cannot always be satisfied in realistic scenarios. As a prevailing approach to address the above problem, federated learning has demonstrated its power to cooperate with the distributed data available while protecting the privacy of data providers. However, it is still unclear how existing time series anomaly detection algorithms perform with decentralized data storage and privacy protection through federated learning. To study this, we conduct a federated time series anomaly detection benchmark, named FedTADBench, which involves five representative time series anomaly detection algorithms and four popular federated learning methods. We would like to answer the following questions: (1) How do time series anomaly detection algorithms perform when combined with federated learning? (2) Which federated learning method is the most appropriate one for time series anomaly detection? (3) How do federated time series anomaly detection approaches perform on different partitions of data in clients? Extensive results as well as corresponding analysis are provided from experiments with various settings. The source code of our benchmark is publicly available at https://github.com/fanxingliu2020/FedTADBench.
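For concreteness, the sketch below shows the kind of FedAvg-style aggregation that such a benchmark pairs with local anomaly-detection training: each client fits a model on its own series and the server computes a sample-weighted average of parameters. The function and data layout are placeholders, not taken from the FedTADBench code.

```python
import numpy as np

def fedavg_round(client_params, client_sizes):
    """One FedAvg aggregation step: weighted average of client parameters.

    client_params: list of dicts mapping parameter name -> np.ndarray.
    client_sizes:  number of local training samples per client.
    """
    total = float(sum(client_sizes))
    global_params = {}
    for name in client_params[0]:
        global_params[name] = sum(
            (n / total) * p[name] for p, n in zip(client_params, client_sizes)
        )
    return global_params

# Toy usage: three clients, each holding a single weight matrix.
clients = [{"w": np.random.randn(4, 4)} for _ in range(3)]
sizes = [100, 250, 50]
global_model = fedavg_round(clients, sizes)
print(global_model["w"].shape)  # (4, 4)
```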
By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io
Causal inference is the process of using assumptions, study designs, and estimation strategies to draw conclusions about the causal relationships between variables based on data. This allows researchers to better understand the underlying mechanisms at work in complex systems and make more informed decisions. In many settings, we may not fully observe all the confounders that affect both the treatment and outcome variables, complicating the estimation of causal effects. To address this problem, a growing literature in both causal inference and machine learning proposes to use Instrumental Variables (IV). This paper serves as the first effort to systematically and comprehensively introduce and discuss IV methods and their applications in both causal inference and machine learning. First, we provide the formal definition of IVs and discuss the identification problem of IV regression methods under different assumptions. Second, we categorize the existing work on IV methods into three streams according to the focus of the proposed methods: two-stage least squares with IVs, control functions with IVs, and evaluation of IVs. For each stream, we present both the classical causal inference methods and recent developments in the machine learning literature. Then, we introduce a variety of applications of IV methods in real-world scenarios and provide a summary of the available datasets and algorithms. Finally, we summarize the literature, discuss the open problems, and suggest promising future research directions for IV methods and their applications. We also develop a toolkit of the IV methods reviewed in this survey at https://github.com/causal-machine-learning-lab/mliv.
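As a worked example of the first stream, two-stage least squares, here is a minimal sketch under a simple linear model in which the instrument Z affects the treatment T but influences the outcome Y only through T; the simulated data, coefficients, and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data: U is an unobserved confounder, Z a valid instrument.
U = rng.normal(size=n)
Z = rng.normal(size=n)
T = 0.8 * Z + 0.6 * U + rng.normal(size=n)   # treatment
Y = 2.0 * T + 1.5 * U + rng.normal(size=n)   # outcome, true causal effect = 2

# Stage 1: regress the treatment on the instrument, keep the fitted values.
X1 = np.column_stack([np.ones(n), Z])
T_hat = X1 @ np.linalg.lstsq(X1, T, rcond=None)[0]

# Stage 2: regress the outcome on the fitted treatment.
X2 = np.column_stack([np.ones(n), T_hat])
beta = np.linalg.lstsq(X2, Y, rcond=None)[0]
print(beta[1])  # close to the true effect 2.0 despite confounding by U
```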
The success of deep learning is partly attributed to the availability of massive data downloaded freely from the Internet. However, it also means that users' private data may be collected by commercial organizations without consent and used to train their models. Therefore, it is important and necessary to develop a method or tool to prevent unauthorized data exploitation. In this paper, we propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners. Specifically, the noise produced by the generator for each image has the confounder property. It can build spurious correlations between images and labels, so that the model cannot learn the correct mapping from images to labels in this noise-added dataset. Meanwhile, the discriminator is used to ensure that the generated noise is small and imperceptible, thereby preserving the normal utility of the encrypted image for humans. The experiments are conducted on six image classification datasets, consisting of three natural object datasets and three medical datasets. The results demonstrate that our method not only outperforms state-of-the-art methods in standard settings, but can also be applied to fast encryption scenarios. Moreover, we present a series of transferability and stability experiments to further illustrate the effectiveness and superiority of our method.
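A minimal sketch of the adversarial setup described above, assuming placeholder architectures: a generator outputs a small, bounded perturbation per image, and a discriminator compares perturbed and clean images, which during training would keep the noise imperceptible. The module definitions and the bound eps are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class NoiseGenerator(nn.Module):
    """Produces a per-image perturbation bounded in [-eps, eps] (illustrative)."""
    def __init__(self, eps=8 / 255):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.eps * self.net(x)

generator = NoiseGenerator()
discriminator = nn.Sequential(   # placeholder critic on clean vs. perturbed images
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.LazyLinear(1),
)

images = torch.rand(4, 3, 32, 32)
encrypted = (images + generator(images)).clamp(0, 1)
real_score = discriminator(images)     # these scores would enter an adversarial
fake_score = discriminator(encrypted)  # loss that keeps the added noise small
print(encrypted.shape, real_score.shape, fake_score.shape)
```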
The opaqueness of the multi-hop fact verification model imposes imperative requirements for explainability. One feasible way is to extract rationales, a subset of inputs, whose removal causes the prediction performance to drop dramatically. Though explainable, most rationale extraction methods for multi-hop fact verification explore the semantic information within each piece of evidence individually, while ignoring the topological information interaction among different pieces of evidence. Intuitively, a faithful rationale bears complementary information that enables other rationales to be extracted through the multi-hop reasoning process. To tackle these disadvantages, we cast explainable multi-hop fact verification as subgraph extraction, which can be solved with a graph convolutional network (GCN) equipped with salience-aware graph learning. Specifically, the GCN is utilized to incorporate the topological interaction information among multiple pieces of evidence for learning evidence representations. Meanwhile, to alleviate the influence of noisy evidence, salience-aware graph perturbation is introduced into the message passing of the GCN. Moreover, a multi-task model with three diagnostic properties of rationales is elaborately designed to improve the quality of an explanation without any explicit annotations. Experimental results on the FEVEROUS benchmark show significant gains over previous state-of-the-art methods for both rationale extraction and fact verification.
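As a rough sketch of the salience-aware message passing idea, the code below scales each evidence node's outgoing messages by a salience score inside a single graph-convolution step; the normalization and weighting scheme are simplified assumptions rather than the paper's exact perturbation mechanism.

```python
import numpy as np

def gcn_layer_with_salience(H, A, W, salience):
    """One graph-convolution step over evidence nodes, with messages from each
    node scaled by its salience score (illustrative sketch).

    H: (N, d) evidence representations, A: (N, N) 0/1 adjacency,
    W: (d, d') layer weights, salience: (N,) scores in [0, 1].
    """
    A_hat = A * salience[None, :]          # down-weight messages from noisy evidence
    A_hat = A_hat + np.eye(len(A))         # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    return np.maximum(0.0, (A_hat / deg) @ H @ W)   # mean aggregation + ReLU

# Toy usage: five evidence pieces with 8-dimensional features.
N, d = 5, 8
H = np.random.randn(N, d)
A = (np.random.rand(N, N) > 0.5).astype(float)
salience = np.random.rand(N)
W = np.random.randn(d, d)
print(gcn_layer_with_salience(H, A, W, salience).shape)  # (5, 8)
```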
Machine learning (ML) is revolutionizing protein structural analysis, including an important subproblem of predicting protein residue contact maps, i.e., which amino-acid residues are in close spatial proximity given the amino-acid sequence of a protein. Despite recent progress in ML-based protein contact prediction, predicting contacts across a wide range of distances (commonly classified into short-, medium- and long-range contacts) remains a challenge. Here, we propose a multiscale graph neural network (GNN) based approach taking a cue from multiscale physics simulations, in which a standard pipeline involving a recurrent neural network (RNN) is augmented with three GNNs to refine predictive capability for short-, medium- and long-range residue contacts, respectively. Test results on the ProteinNet dataset show improved accuracy for contacts of all ranges using the proposed multiscale RNN+GNN approach over the conventional approach, including the most challenging case of long-range contact prediction.
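For reference, the sketch below splits a predicted contact map into the short-, medium- and long-range bands that the three GNNs would each refine, using the common sequence-separation convention (6-11, 12-23, and >= 24 residues); the thresholds and variable names are assumptions, not taken from the paper.

```python
import numpy as np

def range_masks(L):
    """Boolean masks over an L x L contact map by sequence separation |i - j|."""
    sep = np.abs(np.arange(L)[:, None] - np.arange(L)[None, :])
    short = (sep >= 6) & (sep <= 11)
    medium = (sep >= 12) & (sep <= 23)
    long_range = sep >= 24
    return short, medium, long_range

L = 100
base_map = np.random.rand(L, L)   # stand-in for an RNN-based contact prediction
short, medium, long_range = range_masks(L)
# In the described pipeline, each band would be refined by its own GNN over the
# residue graph; here we only show the per-band selection of contact entries.
print(base_map[short].shape, base_map[medium].shape, base_map[long_range].shape)
```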